- The ability to accurately interpret complex visual information is a crucial capability for multimodal large language models (MLLMs). Recent work indicates that enhanced visual perception significantly reduces hallucinations and improves performance on resolution-sensitive tasks such as optical character recognition and document analysis. Several recent MLLMs achieve this with a mixture of vision encoders. Despite their success, there is a lack of systematic comparisons and detailed ablation studies addressing critical aspects such as expert selection and the integration of multiple vision experts. This study provides an extensive exploration of the design space for MLLMs that use a mixture of vision encoders and resolutions. Our findings reveal several underlying principles common to existing strategies, leading to a streamlined yet effective design approach. We discover that simply concatenating visual tokens from a set of complementary vision encoders is as effective as more complex mixing architectures or strategies (a minimal illustrative sketch of such fusion appears after this list). We additionally introduce Pre-Alignment to bridge the gap between vision-focused encoders and language tokens, enhancing model coherence. The resulting family of MLLMs, Eagle, surpasses other leading open-source models on major MLLM benchmarks. Free, publicly accessible full text available April 24, 2026.
- Artificial Intelligence (AI) is poised to revolutionize numerous aspects of human life, with healthcare among the most critical fields set to benefit from this transformation. Medicine remains one of the most demanding, expensive, and impactful sectors, facing challenges such as information retrieval, data organization, diagnostic accuracy, and cost reduction. AI is uniquely suited to address these challenges, ultimately improving quality of life and reducing healthcare costs for patients worldwide. Despite this potential, the adoption of AI in healthcare has been slower than in other industries, highlighting the need to understand the specific obstacles hindering its progress. This review identifies the current shortcomings of AI in healthcare and explores its possibilities, realities, and frontiers to provide a roadmap for future advancements. Free, publicly accessible full text available December 1, 2025.
- Noise and inconsistency are common in real-world information networks, owing to the inherently error-prone nature of human input or to user privacy concerns. To date, tremendous effort has gone into advancing feature learning from networks, including the most recent graph convolutional networks (GCNs) and attention GCNs, by integrating node content and topology structure. However, existing methods treat networks as error-free sources and regard the feature content of each node as independent and equally important for modeling node relations. Noisy node content, combined with sparse features, poses significant challenges for applying these methods to real-world noisy networks. In this article, we propose the feature-based attention GCN (FA-GCN), a feature-attention graph convolution learning framework for networks with noisy and sparse node content. To tackle noise and sparsity in each node, FA-GCN first employs a long short-term memory (LSTM) network to learn a dense representation for each node feature. To model interactions between neighboring nodes, a feature-attention mechanism allows neighboring nodes to learn and vary feature importance with respect to their connections (a simplified sketch of such a layer appears after this list). Through a spectral-based graph convolution aggregation process, each node can then concentrate on the neighborhood features most relevant to the learning task. Experiments and validations across different noise levels demonstrate that FA-GCN outperforms state-of-the-art methods in both noise-free and noisy network environments.
- Structural biology efforts using cryogenic electron microscopy are frequently stifled by specimens adopting "preferred orientations" on grids, leading to anisotropic map resolution and impeding structure determination. Tilting the specimen stage during data collection is a generalizable solution but has historically led to substantial resolution attenuation. Here, we develop updated data collection and image processing workflows and demonstrate, using multiple specimens, that resolution attenuation is negligible or significantly reduced across tilt angles. Reconstructions with and without the stage tilted as high as 60° are virtually indistinguishable. These strategies allowed the reconstruction to 3 Å resolution of a bacterial RNA polymerase with preferred orientation, containing an unnatural nucleotide for studying novel base pair recognition. Furthermore, we present a quantitative framework that allows cryo-EM practitioners to define an optimal tilt angle during data acquisition (an idealized geometric illustration of why tilt helps appears after this list). These results reinforce the utility of employing stage tilt for data collection and provide quantitative metrics to obtain isotropic maps.
- Real-world networked systems often exhibit dynamic behavior, with network nodes and topology continuously evolving over time. When learning from dynamic networks, it is beneficial to correlate all temporal networks to fully capture the similarity and relevance between nodes. Recent work on dynamic network representation learning typically trains each snapshot network independently and imposes relevance regularization across networks at different time steps. Such a snapshot scheme fails to leverage topology similarity between temporal networks for progressive training. In addition to the static node relationships within each network, nodes can show similar variation patterns (e.g., changes in local structure) across the temporal network sequence. Both static node structures and temporal variation patterns can be combined to better characterize node affinities for unified embedding learning. In this paper, we propose Graph Attention Evolving Networks (GAEN) for dynamic network embedding that preserves similarities between nodes derived from their temporal variation patterns. Instead of training graph attention weights for each network independently, we allow model weights to be shared and to evolve across all temporal networks according to their topology discrepancies (a toy sketch of such weight evolution appears after this list). Experiments and validations on four real-world dynamic graphs demonstrate that GAEN outperforms the state-of-the-art in both link prediction and node classification tasks.
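The fusion strategy highlighted in the Eagle abstract above, plain concatenation of visual tokens from complementary encoders, can be illustrated with a short PyTorch sketch. The class name, the shared 24x24 token grid, and the single linear projector are illustrative assumptions, not Eagle's published configuration.

```python
# Minimal sketch of channel-wise fusion of visual tokens from multiple vision
# encoders, in the spirit of the "simple concatenation" finding above. The
# 24x24 shared grid and the single linear projector are assumptions for
# illustration, not Eagle's published configuration.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiEncoderFusion(nn.Module):
    def __init__(self, encoder_dims, llm_dim, grid_size=24):
        super().__init__()
        self.grid_size = grid_size                              # shared token grid per encoder
        self.projector = nn.Linear(sum(encoder_dims), llm_dim)  # into LLM embedding space

    def forward(self, token_maps):
        """token_maps: list of tensors, each (B, H_i, W_i, C_i) from one vision encoder."""
        resized = []
        for t in token_maps:
            x = t.permute(0, 3, 1, 2)                      # (B, C_i, H_i, W_i)
            x = F.interpolate(x, size=(self.grid_size, self.grid_size),
                              mode="bilinear", align_corners=False)
            resized.append(x.permute(0, 2, 3, 1))          # back to (B, G, G, C_i)
        fused = torch.cat(resized, dim=-1)                 # concatenate along channels
        fused = fused.flatten(1, 2)                        # (B, G*G, sum of C_i)
        return self.projector(fused)                       # (B, G*G, llm_dim)
```

One consequence of concatenating along the channel dimension of a shared grid, as in this sketch, is that the number of visual tokens handed to the language model stays fixed regardless of how many encoders are mixed in.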
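For the FA-GCN abstract, the sketch below shows, under stated assumptions, how an LSTM that densifies node features, a feature-attention weighting, and a normalized-adjacency graph convolution could fit together. The neighbor-conditioned attention the authors describe is simplified here to per-node feature attention, and all names and shapes are illustrative rather than taken from any released code.

```python
# Simplified sketch of a feature-attention graph convolution layer: an LSTM
# densifies each node's feature sequence, attention scores reweight the
# features per node, and a symmetrically normalized adjacency aggregates the
# result. Names, shapes, and the per-node (rather than neighbor-conditioned)
# attention are assumptions for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F


class FeatureAttentionGCNLayer(nn.Module):
    def __init__(self, feat_dim, hidden_dim):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, feat_dim, batch_first=True)  # densify sparse features
        self.attn = nn.Linear(feat_dim, 1)                          # score each feature slot
        self.gcn = nn.Linear(feat_dim, hidden_dim)                  # graph-convolution weights

    def forward(self, feat_seq, adj):
        """feat_seq: (N, F, D) sequences of F feature embeddings per node.
        adj: (N, N) adjacency matrix that already includes self-loops."""
        dense, _ = self.lstm(feat_seq)                    # (N, F, D) dense feature vectors
        scores = F.softmax(self.attn(dense), dim=1)       # (N, F, 1) feature attention
        node_repr = (scores * dense).sum(dim=1)           # (N, D) attended node features
        deg = adj.sum(dim=1)                              # node degrees
        d_inv_sqrt = deg.clamp(min=1e-12).pow(-0.5)
        a_hat = d_inv_sqrt.unsqueeze(1) * adj * d_inv_sqrt.unsqueeze(0)  # D^-1/2 A D^-1/2
        return F.relu(self.gcn(a_hat @ node_repr))        # (N, hidden_dim)
```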
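For the cryo-EM tilt abstract, the snippet below is an idealized geometric illustration, not the authors' quantitative framework. Assuming a single sharply preferred viewing axis and uniformly random in-plane rotations, each image samples a central Fourier plane perpendicular to its viewing direction; a stage tilt of t then leaves a missing double cone of half-angle (90° - t), so the unsampled solid-angle fraction shrinks as the tilt grows.

```python
# Idealized illustration only: unsampled Fourier-space fraction as a function
# of stage tilt, for a specimen with one preferred viewing axis and random
# in-plane rotations. A real optimal-tilt analysis accounts for much more
# (orientation spread, dose, beam-induced motion, detector geometry).
import numpy as np


def missing_cone_fraction(tilt_deg):
    """Solid-angle fraction of Fourier space left unsampled at a given stage tilt."""
    half_angle = np.deg2rad(90.0 - tilt_deg)   # half-angle of the missing double cone
    return 1.0 - np.cos(half_angle)            # both cone caps over the full sphere


for tilt in (0, 30, 45, 60):
    print(f"tilt {tilt:2d} deg -> unsampled fraction ~ {missing_cone_fraction(tilt):.3f}")
```

The loop prints roughly 1.0, 0.5, 0.29, and 0.13 for tilts of 0°, 30°, 45°, and 60°, which is one simple way to see both why a high tilt helps and why the returns diminish as the tilt increases.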
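For the GAEN abstract, the toy sketch below illustrates sharing and evolving attention-layer weights across snapshots instead of training each snapshot independently. The GRU-based weight evolution and the mean-pooled topology summary driving it are assumptions for illustration and are not claimed to match the authors' architecture.

```python
# Toy sketch of evolving attention weights across temporal snapshots. The
# GRUCell-driven parameter update and the crude topology summary are
# illustrative assumptions; a practical model would be far more careful
# about scale and parameterization.
import torch
import torch.nn as nn
import torch.nn.functional as F


class EvolvingAttentionLayer(nn.Module):
    """GAT-style layer whose weight matrix is evolved from snapshot to snapshot."""

    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.in_dim, self.out_dim = in_dim, out_dim
        self.w0 = nn.Parameter(torch.randn(in_dim * out_dim) * 0.01)   # initial weights
        self.evolve = nn.GRUCell(in_dim * out_dim, in_dim * out_dim)   # weight updater
        self.attn = nn.Linear(2 * out_dim, 1)

    def snapshot_forward(self, x, adj, w_flat):
        """One attention pass on a single snapshot with the current weights."""
        h = x @ w_flat.view(self.in_dim, self.out_dim)                 # (N, out_dim)
        n = h.size(0)
        pair = torch.cat([h.unsqueeze(1).expand(n, n, -1),
                          h.unsqueeze(0).expand(n, n, -1)], dim=-1)
        e = F.leaky_relu(self.attn(pair).squeeze(-1))
        e = e.masked_fill(adj == 0, float("-inf"))                     # keep edges only
        return F.softmax(e, dim=1) @ h                                 # neighbor-attended output

    def forward(self, snapshots):
        """snapshots: time-ordered list of (features, adjacency-with-self-loops) pairs."""
        w_flat, outputs = self.w0, []
        for x, adj in snapshots:
            outputs.append(self.snapshot_forward(x, adj, w_flat))
            summary = (adj @ x).mean(dim=0)                            # crude topology summary
            signal = summary.repeat(self.out_dim)                      # match GRU input size
            w_flat = self.evolve(signal.unsqueeze(0), w_flat.unsqueeze(0)).squeeze(0)
        return outputs
```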